time efficient kernel density estimation
Space and Time Efficient Kernel Density Estimation in High Dimensions
Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-linear preprocessing time. These limitations inhibit the practical applicability of their approach on large datasets. In this work, we present an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time. We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property. Our experiments on various datasets verify that our approach attains accuracy and query time similar to Charikar and Siminelakis (2017), with significantly improved space and preprocessing time.
- Asia > Afghanistan > Parwan Province > Charikar (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
Reviews: Space and Time Efficient Kernel Density Estimation in High Dimensions
Overall the paper is an average paper but clearly written. This paper proposes an improvement of Charikar's approach to achieve sublinear kernel density estimation with linear space and linear time preprocessing. Experimental results focus mainly on Laplacian (L1 variant in the main submission and L2 variant added in supplement). The key observation for achieving linear space is to modify the previous HBE approach so that each hash table stores each point in the dataset with constant probability - in this way, the superlinear storage cost is overcome. However, my main complaint is in the experimental results.
Reviews: Space and Time Efficient Kernel Density Estimation in High Dimensions
After a careful discussion among the reviewers, there is a clear consensus that the paper provides a solid theoretical contributions for kernel density estimation in high dimensions. Hence, I am happy to recommend to accept this paper for publication at NeurIPS2019. Nevertheless, one concern that came up during the discussion is the lack of experimental comparison and clarity in how it was performed. I urge the authors to incorporate reviewers' comment on this weakness of the paper into the camera-ready version.
Space and Time Efficient Kernel Density Estimation in High Dimensions
Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-linear preprocessing time. These limitations inhibit the practical applicability of their approach on large datasets. In this work, we present an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time. We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property. Our experiments on various datasets verify that our approach attains accuracy and query time similar to Charikar and Siminelakis (2017), with significantly improved space and preprocessing time.
Space and Time Efficient Kernel Density Estimation in High Dimensions
Backurs, Arturs, Indyk, Piotr, Wagner, Tal
Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-linear preprocessing time. These limitations inhibit the practical applicability of their approach on large datasets. In this work, we present an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time. We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property.